Image to Image Translation for Domain Adaptation
We propose a general framework for unsupervised domain adaptation, which
allows deep neural networks trained on a source domain to be tested on a
different target domain without requiring any training annotations in the
target domain. This is achieved by adding extra networks and losses that help
regularize the features extracted by the backbone encoder network. To this end,
we propose the novel use of the recently proposed unpaired image-to-image
translation framework to constrain the features extracted by the encoder
network. Specifically, we require that the features extracted are able to
reconstruct the images in both domains. In addition, we require that the
distributions of features extracted from images in the two domains be
indistinguishable. Many recent works can be seen as specific cases of our
general framework. We apply our method for domain adaptation between MNIST,
USPS, and SVHN datasets, and Amazon, Webcam and DSLR Office datasets in
classification tasks, and also between GTA5 and Cityscapes datasets for a
segmentation task. We demonstrate state-of-the-art performance on each of these
datasets.
Reinforcement learning for freeform robot design
Inspired by the necessity of morphological adaptation in animals, a growing
body of work has attempted to expand robot training to encompass physical
aspects of a robot's design. However, reinforcement learning methods capable of
optimizing the 3D morphology of a robot have been restricted to reorienting or
resizing the limbs of a predetermined and static topological genus. Here we
show policy gradients for designing freeform robots with arbitrary external and
internal structure. This is achieved through actions that deposit or remove
bundles of atomic building blocks to form higher-level nonparametric
macrostructures such as appendages, organs and cavities. Although results are
provided for open-loop control only, we discuss how this method could be
adapted for closed-loop control and sim2real transfer to physical machines in
future.
Beyond pairwise clustering
We consider the problem of clustering in domains where the affinity relations are not dyadic (pairwise), but rather triadic, tetradic or higher. The problem is an instance of the hypergraph partitioning problem. We propose a two-step algorithm for solving this problem. In the first step we use a novel scheme to approximate the hypergraph using a weighted graph. In the second step a spectral partitioning algorithm is used to partition the vertices of this graph. The algorithm is capable of handling hyperedges of all orders including order two, thus incorporating information of all orders simultaneously. We present a theoretical analysis that relates our algorithm to an existing hypergraph partitioning algorithm and explain the reasons for its superior performance. We report the performance of our algorithm on a variety of computer vision problems and compare it to several existing hypergraph partitioning algorithms.
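The two-step idea in this abstract (approximate the hypergraph by a weighted graph, then partition that graph spectrally) can be illustrated with a minimal sketch. The abstract does not spell out the approximation scheme, so the code below swaps in plain clique expansion as an assumption; the function names, the toy hypergraph, and the power-iteration solver are all hypothetical choices for illustration, not the authors' algorithm. Every vertex is assumed to belong to at least one hyperedge.

```python
import math
from itertools import combinations

def clique_expansion(n, hyperedges):
    """Step 1 (assumed scheme): spread each hyperedge weight over its vertex pairs."""
    W = [[0.0] * n for _ in range(n)]
    for vertices, weight in hyperedges:
        share = weight / (len(vertices) - 1)  # an order-2 hyperedge keeps its full weight
        for i, j in combinations(vertices, 2):
            W[i][j] += share
            W[j][i] += share
    return W

def spectral_bipartition(W, iters=200):
    """Step 2: split vertices by the sign of the second eigenvector of
    M = D^-1/2 W D^-1/2, found by power iteration on (I + M) / 2."""
    n = len(W)
    deg = [sum(row) for row in W]  # must be positive for every vertex
    total = math.sqrt(sum(deg))
    top = [math.sqrt(d) / total for d in deg]  # trivial eigenvector of M (eigenvalue 1)
    v = [float(i + 1) for i in range(n)]       # fixed, deterministic start vector
    for _ in range(iters):
        # Project out the trivial eigenvector so the iteration finds the next one.
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]
        Mv = [sum(W[i][j] * v[j] / math.sqrt(deg[i] * deg[j]) for j in range(n))
              for i in range(n)]
        v = [0.5 * (a + b) for a, b in zip(v, Mv)]  # shift keeps all eigenvalues >= 0
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return [1 if x > 0 else 0 for x in v]

# Toy hypergraph: two triadic relations joined by one weak pairwise relation.
edges = [((0, 1, 2), 1.0), ((3, 4, 5), 1.0), ((2, 3), 0.1)]
labels = spectral_bipartition(clique_expansion(6, edges))
print(labels)
```

Because hyperedges of order two and order three are accumulated into the same matrix, the sketch mixes information of different orders in one graph, which mirrors the property the abstract highlights.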
Efficient automatic design of robots
Robots are notoriously difficult to design because of complex
interdependencies between their physical structure, sensory and motor layouts,
and behavior. Despite this, almost every detail of every robot built to date
has been manually determined by a human designer after several months or years
of iterative ideation, prototyping, and testing. Inspired by evolutionary
design in nature, the automated design of robots using evolutionary algorithms
has been attempted for two decades, but it too remains inefficient: days of
supercomputing are required to design robots in simulation that, when
manufactured, exhibit desired behavior. Here we show for the first time de novo
optimization of a robot's structure to exhibit a desired behavior, within
seconds on a single consumer-grade computer, and the manufactured robot's
retention of that behavior. Unlike other gradient-based robot design methods,
this algorithm does not presuppose any particular anatomical form; starting
instead from a randomly generated apodous body plan, it consistently discovers
legged locomotion, the most efficient known form of terrestrial movement. If
combined with automated fabrication and scaled up to more challenging tasks,
this advance promises near-instantaneous design, manufacture, and deployment of
unique and useful machines for medical, environmental, vehicular, and
space-based tasks.
Detecting the Starting Frame of Actions in Video
In this work, we address the problem of precisely localizing key frames of an
action, for example, the precise time that a pitcher releases a baseball, or
the precise time that a crowd begins to applaud. Key frame localization is a
largely overlooked and important action-recognition problem, for example in the
field of neuroscience, in which we would like to understand the neural activity
that produces the start of a bout of an action. To address this problem, we
introduce a novel structured loss function that properly weights the types of
errors that matter in such applications: it penalizes extra and missed action
start detections more heavily than small misalignments. Our structured loss is
based on the best matching between predicted and labeled action starts. We
train recurrent neural networks (RNNs) to minimize differentiable
approximations of this loss. To evaluate these methods, we introduce the Mouse
Reach Dataset, a large, annotated video dataset of mice performing a sequence
of actions. The dataset was collected and labeled by experts for the purpose of
neuroscience research. On this dataset, we demonstrate that our method
outperforms related approaches and baseline methods using an unstructured loss.
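To make the matching-based loss concrete, here is a minimal, non-differentiable sketch of a cost computed from the best matching between predicted and labeled action starts. It is only an illustration under stated assumptions: the function name, the cost constants, the tolerance window, and the use of a monotone dynamic program over sorted frame indices are all hypothetical and are not taken from the paper, which trains RNNs on differentiable approximations of its loss.

```python
def start_matching_cost(predicted, labeled, miss_cost=10.0, extra_cost=10.0,
                        per_frame_cost=0.1, tolerance=20):
    """Cost of the best monotone matching between predicted and labeled start frames.

    Unmatched predictions (extra detections) and unmatched labels (missed
    detections) pay a large fixed cost; matched pairs pay a small cost
    proportional to their misalignment in frames. All constants are assumed.
    """
    predicted = sorted(predicted)
    labeled = sorted(labeled)
    n, m = len(predicted), len(labeled)
    # cost[i][j]: best cost using the first i predictions and first j labels.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * extra_cost
    for j in range(1, m + 1):
        cost[0][j] = j * miss_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(cost[i - 1][j] + extra_cost,  # prediction i left unmatched
                       cost[i][j - 1] + miss_cost)   # label j left unmatched
            gap = abs(predicted[i - 1] - labeled[j - 1])
            if gap <= tolerance:                     # pairs may match only if close
                best = min(best, cost[i - 1][j - 1] + per_frame_cost * gap)
            cost[i][j] = best
    return cost[n][m]

# Two slightly misaligned detections versus one missed start.
print(start_matching_cost([100, 205], [102, 200]))
print(start_matching_cost([100], [100, 200]))
```

With these assumed constants, a single missed or extra start costs as much as 100 frames of misalignment, which reflects the weighting the abstract describes.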
- …